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Since roughly a decade ago, network science has focussed among others on the problem of how the 
spreading of diseases depends on structural patterns. Here, we contribute to further advance our 
understanding of epidemic spreading processes by proposing a non-perturbative formulation of the 
heterogeneous mean field approach that has been commonly used in the physics literature to deal 
with this kind of spreading phenomena. The non-perturbative equations we propose have no as- 
sumption about the proximity of the system to the epidemic threshold, nor any linear approximation 
of the dynamics. In particular, we first develop a probabilistic description at the node level of the 
epidemic propagation for the so-called susceptible-infected-susceptible family of models, and after we 
derive the corresponding heterogeneous mean-field approach. We propose to use the full extension 
of the approach instead of pruning the expansion to first order, which leads to a non-perturbative 
formulation that can be solved by fixed point iteration, and used with reliability far away from the 
epidemic threshold to assess the prevalence of the epidemics. Our results are in close agreement 
with Monte Carlo simulations thus enhancing the predictive power of the classical heterogeneous 
mean field approach, while providing a more effective framework in terms of computational time. 

PACS numbers: 89.75.Fb, 89.75. He, 89.75.Da 



I. INTRODUCTION 

During the last decade, the physicists' community 
working on the theory of complex networks has paid spe- 
cial attention to the problem of epidemic spreading in 
social [l| , biological [|f{5[ and technological networks @ . 
The development of mathematical and computational 
models to guide our understanding of the disease dynam- 
ics has allowed to address important issues such as the 
influence of diverse contact patterns and also new algo- 
rithms for immunization and vaccination policies 0, ■ 
Physicist's approaches to problems in epidemiology in- 
voke statistical physics, the theory of phase transitions 
and critical phenomena [9] , to grasp the macroscopic be- 
havior of epidemic outbreaks [l|, llOl - uq . It is not ad- 
venturous to claim that one of the main artifices of this 
success has been the Mean-Field (MF) approximation, 
where homogeneity and isotropy are hypothesized to re- 
duce the complexity of the system under study. 

On the other hand, the study of the topological prop- 
erties of complex networks [17H19I ] has provided new 
grounds to the understanding of contagion dynamics. 
Particularly widespread in nature are heavy tailed degree 
distributions, specially scale-free (SF) networks, whose 
degree distribution follows a power law P(k) ~ k 1 for 
the number of connections, k, an individual has. SF net- 
works include patterns of sexual contacts [20|, the Inter- 
net Q, as well as many other social, technological and 
biological networks [2l[ . The critical properties of an epi- 
demic outbreak in SF networks were addressed using the 



heterogeneous mean-field (HMF) prescription (also called 
correlated mean field) [l|, \MMM- HMF coarse-grams 
vertices within degree classes and hypothesizes that all 
nodes in a degree class have the same dynamical prop- 
erties; the ap pro ach also assumes that fluctuations can 
be neglected [1(|. This framework has been proved to 
be exact in annealed networks, whose nodes' degrees are 
sampled from a fixed degree distribution at each step of 
the dynamics [l8| (i.e. its specific connectivity is fixed 
only in average). However, in specific realizations of a 
model's topology (quenched networks), HMF can result 
in different levels of accuracy [22| . This problem leads to 
the question of whether or not the direct use of the HMF 
approach is accurate enough when dealing with real net- 
works (i.e., in an instance of a network ensemble). In 
a recent work, Gleeson et al. [23| studied how accurate 
the HMF can be on 21 real- world networks and found 
some relationships between the predictive accuracy and 
topological properties of the networks. Also other works 
have addressed this problem using pair approximations, 
in particular Miller [24[ has found analytical results for 
a special class of clustered networks. 

Although the HMF approach has been extremely use- 
ful to assess the critical properties of epidemic models, 
it is not thought to give information of individual nodes 
but of classes of nodes of a given degree. Then, asking 
about the probability that a given node is infected is not 
well-posed in this framework. To obtain more details at 
the individual (node) level of description, usually one has 
to rely on Monte Carlo (MC) simulations of the actual 
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dynamics, which have also been used to validate the re- 
sults obtained using HMF methods. The current theory 
concentrates in two specific situations, the contact pro- 
cess [26l - l3i| (CP) and the fully reactive process |32| - [34j 
(RP). A CP stands for a dynamical process that involves 
an individual stochastic contagion per infected node per- 
unit time, while in the RP there are as many stochastic 
contagions per unit time as neighbors a node has. How- 
ever, in real situations, the number of stochastic contacts 
per unit time is surely a variable of the problem itself (35| . 

Recently, some of us have proposed an alternative 
approach to study the Susceptible-Infected-Susceptiblc 
(SIS) model [l-Q considering the number of stochastic 
contacts as a parameter that defines a whole family of 
SIS models [3a], the so-called Microscopic Markov-Chain 
Approach (MMCA). This more realistic scenario allows 
to characterize the prevalence of the disease at the level 
of individual nodes when the number of contacts inter- 
polates between the two limiting cases of CP and RP. 
Capitalizing on the MMCA framework, we propose here a 
non-perturbative HMF formulation which does not prune 
the equations at first order in the prevalence density, but 
considers the whole series expansion. The resulting equa- 
tions can be solved using fixed point iteration, and are ac- 
curate in predicting the epidemic incidence for the whole 
phase diagram, even out of the critical transition region. 



II. DISCRETE-TIME AND 
CONTINUOUS-TIME SIS MODELS IN 
NETWORKS 

The analysis of epidemic spreading models in net- 
works has some of the same difficulties that are found for 
well-mixed populations (equivalent to complete graphs). 
These difficulties are intrinsic to the discrete-time or 
continuous-time formulation of the governing equations, 
and the methods used to solve each of them. Continu- 
ous approximations have been more popular in epidemic 
modeling because of their mathematical tractability, and 
the avoidance of chaotic behaviors that can arise in their 
discrete counterparts [131 • For the sake of clarity, we 
henceforth fix our attention on the study of a family of 
SIS models. In a SIS model, individuals that are cured 
do not develop permanent immunity but are immediately 
susceptible to the disease again. In well-mixed popula- 
tions, the differential equations governing the number of 
susceptible (S) and infected (I) individuals are 



infected individual recovers. Their corresponding differ- 
ence equations are 
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where N = S(t) + 7(4) is the (constant) size of the pop- 
ulation. The term I/N accounts for the probability of 
contacting an infected individual in a well-mixed popu- 
lation of size N, p is the infectivity rate (probability per 
unit time) for each contact, and p is the rate at which one 
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7(4 + At) = J(t) - pAtI{t) + pAt^-[N - 7(4)] . (3) 

Defining p(t) = I(t)/N as the fraction of infected indi- 
viduals in the population, Eq. ([3]) is written as 

p(t + At) = p(t) - p,At p(t) + f3At p{t) [1 - p{t)] . (4) 

Note that while the system of Eqs. (HJ always converges 
to a solution [37], [H[ , Eq. can be mapped to a logistic 
function when the reproductive ratio 1Z — P/p > 1, thus 
giving rise to basic periodicity, bifurcations and chaotic 
behavior depending on the parameters [37j • Although 
both descriptions are equivalent in the limit At — > 0, 
differences arise when considering a finite At. 

Particularly interesting is what happens when consid- 
ering a numerical scheme iterating Eq. (U). In many 
cases A4 is usually assimilated to the stochastic simula- 
tion time unit and set to 1. Consequently, the numerical 
differences with the continuous case are substantial. It is 
also important to distinguish between rates and proba- 
bilities, pAt — P is a probability, and the same holds for 
pAt = p. Again, by setting A4 = 1, one can mix up rate 
and probabilities because both will have the same values, 
even though their units are different. 

The mapping of the above SIS equations to the case of 
heterogeneous networks is not straightforward and has its 
critical step in the redefinition of the probability of con- 
tacts. In a network, the number of contacts is restricted 
to a fixed neighborhood, then each individual (node) can 
potentially contact only its neighbors. In the seminal 
work by Pastor-Satorras and Vespignani (io| . these au- 
thors proposed the direct use of Eq. ((4]) for classes of 
nodes based on their degree k. This is the root of the 
HMF approach in complex networks and the hypothesis 
of homogeneity is here postulated at the level of classes 
of nodes. The rationale behind this assumption is that 
the dynamical behavior of any two nodes with the same 
degree k will be essentially the same. Then, a system of 
equations for each class k is written as 

p k (t+At) = Pk {t)-pAtp k (t)+pAte k (t)[l-p k (t)} , (5) 

where now p k {t) stands for the fraction of infected in- 
dividuals of degree k, and the probability of contacting 
an infected node is encoded in the new function 9^(4). 
For the general case of correlated networks, the function 
0/c(4) takes the form 



e k {t) = J2 p (k'\k) 



(6) 
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where P(k'\k) is the probability that a node of degree k 
connects to a node of degree k' . Eq.© is used to find 
the stationary value of the incidence for given values of 
(3 and /i. Indeed, for the stationary state, it is true that 
the only dependence to take into account is the ratio 
A = PI p, because the equations can be rescaled without 
modifying their solutions. However, this is not anymore 
the case during the transient. The critical value A c for 
the epidemic threshold was found in [25| to be 



A, = 



Amax(C) 



(7) 



being A max (C) the largest eigenvalue of the connectivity 
matrix of classes of nodes C, whose components are given 
by Ckk' = kP{k'\k), i.e. the expected number of links 
from a node of degree k to nodes of degree kl . 



III. THE MICROSCOPIC MARKOV-CHAIN 
APPROACH TO SIS MODELS 

In this section, we summarize an alternative approach 
to describe the equations governing the SIS class of mod- 
els [36[ . We henceforth refer to this formulation as Mi- 
croscopic Markov- Chain Approach (MMCA). The main 
advantage of this approach is that it is able to deal with 
the infection dynamics at the level of single nodes. Specif- 
ically, we focus on the probability of a node to be infected 
at time t. For each node i, we construct a Markov chain 
that accounts for the probability of infection pi{t), as- 
suming that the number of contacts of node i with its 
neighbors is parameterized by an integer rj, and that the 
infection events are uncorrelated. Note that this consti- 
tutes a first principles derivation of a discrete model, not 
a discretization of a differential equation. We consider 
two different situations of interest: i) without reinfec- 
tions (WOR), and ii) with reinfections (WIR). The first 
case implies that the time scales for the infection and cure 
are well separated, whereas the latter assumes that the 
same time scale holds for infection and cure and therefore 
a just recovered individual might catch the disease again 
within the same time step t. The respective equations 
are: 



WOR: 

Pi (t + 1) = [1-J>i(t)][l 



?<(*)] + (1-M)P*(*)> (8) 



already presented in Eqs. ((2J for the well-mixed case, the 
explanation of these equations is straightforward. The 
terms in the r.h.s. of Eq. © account respectively for the 
probability that a susceptible node [1 — Pi (t)] is infected 
by at least one neighbor [1 — <&(£)], and an infected node 
does not recover [(1 — /x)p,(t)]. Eq. © adds, after some 
algebra, a term that accounts for the probability that 
an infected node recovers [fipi(t)] but gets infected again 
by a neighbor [1 — ft(t)] in the same time step. Finally, 
in Eq. (fTUf , we have the probabilities that infected nodes 
[pj{t)] contact node i, and that these contacts lead to new 
infections, which occur with probability /?. The values of 
the contact probabilities r,-j can be expressed as 



W-j 



(11) 



for weighted networks, and 
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k 

for unweighted networks, where 

R v (x) = l-(l-xf . 



(13) 



For the contact process, r\ — 1 and R\{x) — x, whereas 
for the fully reactive process, rj — > oo and R oa (x) = 1 

At the stationary state, Eqs. © and © are indepen- 
dent of the discrete time-step, and simplify to 

WOR: Pi = (1 - Pi )(l - ft) + (1 - rfpi , (14) 
WIR: pi = (1 - q t ) + (1 - fi) Piqi , (15) 



with 
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qi = n^ 1 ~^ r i^) 

3=1 



(16) 



These equations are easily solved by fixed point iteration 
until a fixed point (for the WIR and WOR cases) or a 
cycle (for the WOR setup) is found. In this second case, 
averaging over the oscillating values of the quantity of 
interest must be considered. Finally, the average fraction 
of infected nodes in the stationary state is given by 



1 N 

i=l 



(17) 



WIR: 

Pi {t + 1) = [1 - q t (t)} + (1 - n)pi(t) qi (t) , (9) 

where the probability qi(t) of node i not being infected 
by any neighbor is 



<n(t) = lill-Pr ji p j (t)] 

3=1 



(10) 



Keeping in mind the separation of the two processes, 
namely, contacting a node and transmitting the infection, 



IV. NON-PERTURBATIVE HMF 

In heterogeneous mean field theory it is supposed that 
all nodes of the same degree behave equally. In terms 
of the MMCA formulation this means that pi — pj if 
ki = kj , and the density pk of infected nodes of degree k 
is given by 



Pk = 



-E 



Pj = Pi , 



VieK, 



(18) 
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where K is the set of nodes with degree fc, whose cardi- 
nality is denoted by N k . This notation allows to group 
together terms according to the degrees of the nodes. For 
instance, if the degree of node i is ki = k, then 

X! =EE a ^P k ' = Yl Pk ' X 

j k' jEK' fe' j£K' 

= J2Pk'°kk> =kY, p (k'\k) Pk , , (19) 

k> k' 

where Ckk' = kP{k'\k) is the expected number of links 
from a node of degree k to nodes of degree fe', as explained 
in Sect. HB 

Substitution of the HMF approximation, Eq. (fl"8|) . into 
the MMCA Eqs. CE]) and JSJ leads to 

WOR: Pfc = (l-p fc )(l-?fc) + (l-M)pfc, (20) 
WIR: p k = (1 - %) + (1 - A*)p fc ft , (21) 

which can also be written as 

WOR: = -p J p k + (l-p k ){l-q k ), (22) 
WIR: = -fxp k + (l-(l-fi) Pk )(l-q k ). (23) 

These equations constitute the HMF approximations of 
MMCA for the SIS model without and with reinfections, 
respectively. 

We still need a HMF expression for q k . For unweighted 
networks the value of qi is 



A' 



3=1 3=1 

If node i has degree ki = k, then 

4 = fe = Iin i 1 - PajiR^k'-^pv) , (25) 
fe' jeK' 

where we have grouped together the terms in the product 
by their degrees k' as in Eq. (p~9|) . The expression within 
the parentheses is equal to 1 for all nodes of degree k' 
that are not connected to node i (ayj = 0), and equal to 
[1 — f3R n (k'~ 1 )pk>] for nodes of degree k' that are linked 
to i (dji = 1). Besides, the expected number of such 
terms is Ckk' ■ Hence, we obtain 



q k = H(l-pR v {k'- 1 )p k , 



(26) 



Eqs. and (gSJ), together with Eq. ([25]). form what 
we call the Non-perturbative Heterogeneous Mean Field 
(npHMF) equations of the SIS model in unweighted net- 
works. Note that in the derivation of the npHMF equa- 
tions no assumption has been made about the proxim- 
ity of the system to the epidemic threshold, where the 
epidemic prevalence is small, nor any linear approxima- 
tion has been invoked, hence the qualification of "non- 
perturbative" . 



The solution of the npHMF equations follows the same 
steps as in MMCA, i.e., Eqs. ([2P]l, (j2"Tj) and d2SJ) are iter- 
ated until a fixed point (WIR, WOR) or a cycle (WOR) is 
found. As before, for the latter case, we take the average 
of oscillating values for the disease prevalence. Finally, 
the global epidemic prevalence is given by 



1 X ^ 

x2^N kPk 



(27) 



It is easy to show that the standard HMF equations 
[25l [3l| are just a linear approximation of our WOR 
npHMF equations. Near the epidemic threshold /3 C , 
where P k <C 1, we get 



q k - 1- pY^Ckk'Rnik'-^pki 

k' 

= 1- pk^Pik'^R^k'-^pk 



(28) 



which can be inserted into Eq. (f2"12j) to give 

= -fipk + Pk{l - p k ) P{k'\k)R n {k , - x )p k , , (29) 



where 



1 for the RP, 

^ — for the CP. 

k' 



(30) 



A. Epidemic threshold 

To round off our analysis, we derive the critical spread- 
ing threshold using the different approaches here dis- 
cussed. In [36| it is shown that MMCA allows the de- 
termination of the epidemic threshold 



A m ax(-R) 



(31) 



where A max (i?) is the maximum eigenvalue of the matrix 
R of the contact probabilities ry. In particular, 



. for the RP, 

f3 c = { A max (A) 

H for the CP. 



(32) 



These results hold both with and without reinfections, 
since at first order, Eqs. © and © coincide. In the same 
way, the critical points from npHMF Eqs. (|22l) and ([23]) 
are the same as in standard HMF, where the matrix H 
with elements h kk t — Ckk' Rri{k'~ l ) replaces matrix R, 



P, 



HMF 



A ma x(-ff) 



(33) 



with the well-known particular cases 



jHMF 



t — for the RP, 

A ma x(C) (34) 

a for the CP. 
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FIG. 1: Epidemic prevalence p as a function of the infec- 
tion rate /3. The different curves correspond (as indicated) to 
Monte Carlo results and the numerical solutions of the fixed 
point equations of the different approaches (MMCA, HMF 
and npHMF) for a SF network made up of N = 10 4 nodes 
using the configuration model. The exponent of the degree 
distribution is 7 = 2.7. The contact probabilities used corre- 
spond to the case of a fully reactive process and p has been 
set to 0.5. 



The comparison with Monte Carlo (MC) simulations 
shows that MMCA provides a better approximation to 
the epidemic threshold than HMF, since no information 
of the original network structure is lost. Additionally, 
by the definition of the contact probabilities, the MMCA 
is applicable to weighted and directed networks without 
further modifications of the equations. This is not the 
case when one deals with HMF in networks that are not 
unweighted and undirected. 



V. COMPARISON WITH MONTE CARLO 
SIMULATIONS 

To compare the results coming out from the different 
approaches, we have performed MC simulations of the 
disease dynamics on top of scale-free networks generated 
using the uncorrelated configuration model. Besides, nu- 
merical solutions of the fixed point equations have also 
been obtained. For high values of p, and /?, MMCA with- 
out reinfections does not converge to a fixed point as the 
rest, but it converges to an oscillation between two states, 
from which the average is taken. These oscillations are 
also present in the MC runs, and dissapear after aver- 
aging over multiple runs. On the other hand, standard 
HMF cannot be solved by iteration, since it diverges even 
for small values of (3 ~ lO -1 . Therefore, we have made 
use of a nonlinear minimization algorithm based on a 
successive quadratic programming solver. 

Figure Q] shows the comparison between all the ap- 
proaches discussed throughout this paper for the case 



FIG. 2: Fraction p^ of infected individuals of degree k from 
npHMF versus the infection probabilities pi (black) and their 
average {pi)k over nodes with the same degree (blue) obtained 
from MC. The epidemic model is a RP with reinfections with 
p = 0.5 and f3 = 0.3. For the sake of clarity, we have used in 
this case an SF network of 500 nodes with 7 = 2.7. 



in which a fully reactive process is considered. As can 
be seen, the worst performance with respect to MC sim- 
ulations corresponds to the HMF, which correctly pre- 
dicts the epidemic threshold but fails to reproduce the 
evolution of the epidemic incidence as the infection rate 
increases and moves away from the critical point. The fig- 
ure also shows that the npHMF approximation behaves 
only slightly worse than the Markov Chain formulation. 
More important, in addition to correctly capturing the 
critical epidemic threshold and at variance with the stan- 
dard HMF, it allows to study the whole phase diagram 
whatever the value of the infection rate is. 

To provide further evidences of the validity of the 
npHMF, we have represented in Fig [5] the epidemic inci- 
dence of nodes of degree pk as given by the solution of the 
npHMF equations as a function of the probability that 
a node i, whose connectivity is k, is infected, being the 
latter obtained from MC simulations. Besides, the aver- 
age over all the p, 's for nodes of the same degree has also 
been represented. As can be seen, all individual proba- 
bilities are distributed around the mean value, which in 
turns is close to the values coming out from the npHMF 
formulation. These results further illustrate that the be- 
havior is not the same along all the connectivity classes: 
for large prevalence densities (or probability of being in- 
fected), the dispersion around the mean values shrinks. 
In other words, the degree of accuracy in the prediction 
of the state of a given node i with respect to whether it 
is infected or not depends on pf. : for large values of this 
density, one can predict with high accuracy the individ- 
ual probabilities pi for nodes of degree k, while if pk is 
relatively small, the prediction error is larger due to the 
more pronounced deviations from the average value. 

Finally, we have also tested the new formulation 
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FIG. 3: Epidemic prevalence p as a function of the infection 
rate ft for different values of n as indicated. The top panel 
corresponds to the numerical solutions of the fixed point equa- 
tions of the MMCA formulation, while the bottom figure has 
been obtained using the npHMF equations. In both cases, 
the numerical solutions are compared with the results of MC 
simulations on top of a SF network made up of N = 10 4 
nodes. The exponent of the degree distribution is 7 = 2.7. 
The contact probabilities used correspond to the case of a 
fully reactive process. 



against variation of the recovery rate. The results ob- 
tained for different values of /1 using the MMCA and the 
npHMF approaches are compared with MC simulations 
in Fig. [3] As in Fig.[TJ we have prescribed a fully reactive 
process on top of an undirected and unweighted scale- 
free network. Moreover, the results shown correspond to 
the case in which reinfections are possible (similar results 
are found for the case without reinfections). The differ- 



ent curves show that both formulations perform quite 
well, being however the MMCA slightly better that the 
npHMF case. 

VI. CONCLUSIONS 

In summary, in this paper we have proposed a new 
kind of heterogeneous mean field approach to describe 
the spreading of diseases in complex networks. Capital- 
izing on a previous formulation aimed at computing the 
individual probabilities for nodes to be infected, we have 
coarse grained the dynamical equations by degree classes. 
In doing so, we have been able to keep higher order terms, 
which ultimately allow us to use the resulting fixed point 
equation to obtain numerical solutions that are in good 
agreement with MC simulations. However, although the 
results outperform those obtained through the standard 
mean-field approach, the new formulation is still limited 
with respect to individual-based approaches (such as the 
MMCA). The reason is that, when coarse-graining the 
dynamics, some information of the original network is 
lost and exact quantities are replaced by expected val- 
ues. The ultimate consequence is that, although one can 
accurately reproduce the behavior of global dynamical 
descriptors like the epidemic prevalence, at the individual 
level the probabilities of infection are more error-prone. 
Having said that, we however think that the approach 
proposed here represents a significant improvement on 
the standard HMF framework in several aspects. First, 
it is more accurate. Secondly, it allows to solve the whole 
phase diagram, which opens up the possibility of getting 
fast numerical solutions given a network topology with- 
out resorting to MC simulations. 
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